Quantized Neural Network



Searching for Low-Bit Weights in Quantized Neural Networks

Neural Information Processing Systems

Quantized neural networks with low-bit weights and activations are attractive for developing AI accelerators. However, the quantization functions used in most conventional quantization methods are non-differentiable, which increases the optimization difficulty of quantized networks. Compared with full-precision parameters (i.e., 32-bit floating-point numbers), low-bit values are selected from a much smaller set; for example, there are only 16 possibilities in a 4-bit space. Thus, we propose to treat the discrete weights in an arbitrary quantized neural network as searchable variables, and to search for them accurately with a differentiable method. In particular, each weight is represented as a probability distribution over the discrete value set. The probabilities are optimized during training, and the values with the highest probability are selected to establish the desired quantized network. Experimental results on benchmarks demonstrate that the proposed method produces quantized neural networks with higher performance than state-of-the-art methods on both image classification and super-resolution tasks.
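The search procedure the abstract describes can be sketched in a few lines. The candidate value set, tensor shapes, and softmax parameterization below are illustrative assumptions, not the authors' exact implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-bit candidate set: each weight may take one of 4 discrete values.
values = np.array([-1.0, -0.5, 0.5, 1.0])

# One learnable logit per (weight, candidate) pair; softmax turns the logits
# into a probability distribution over the candidate values.
logits = rng.normal(size=(5, len(values)))  # 5 weights for illustration

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

probs = softmax(logits)                      # (5, 4), each row sums to 1
soft_weights = probs @ values                # differentiable surrogate used in training
hard_weights = values[probs.argmax(axis=1)]  # argmax selection for the final network

print(soft_weights.round(3))
print(hard_weights)
```

In training, gradients flow through the soft (expected) weights to the logits; after convergence, the argmax values replace them, which is what makes the otherwise non-differentiable discrete choice searchable.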


Quantum-Classical Hybrid Quantized Neural Network

Li, Wenxin, Wang, Chuan, Zhu, Hongdong, Gao, Qi, Ma, Yin, Wei, Hai, Wen, Kai

arXiv.org Artificial Intelligence

In this work, we introduce a novel Quadratic Binary Optimization (QBO) framework for training a quantized neural network. The framework enables the use of arbitrary activation and loss functions through spline interpolation, while Forward Interval Propagation addresses the nonlinearities and the multi-layered, composite structure of neural networks by discretizing activation functions into linear subintervals. This preserves the universal approximation properties of neural networks while making complex nonlinear functions accessible to quantum solvers, broadening their applicability in artificial intelligence. Theoretically, we derive an upper bound on the approximation error and on the number of Ising spins required, by deriving the sample complexity of the empirical risk minimization problem from an optimization perspective. A key challenge in solving the associated large-scale Quadratic Constrained Binary Optimization (QCBO) model is the presence of numerous constraints. To overcome this, we adopt the Quantum Conditional Gradient Descent (QCGD) algorithm, which solves QCBO directly on quantum hardware. We establish the convergence of QCGD under a quantum oracle subject to randomness, bounded variance, and limited coefficient precision, and further provide an upper bound on the Time-To-Solution. To enhance scalability, we further incorporate a decomposed copositive optimization scheme that replaces the monolithic lifted model with sample-wise subproblems. This decomposition substantially reduces the quantum resource requirements and enables efficient low-bit neural network training. We further propose using the QCGD and Quantum Progressive Hedging (QPH) algorithms to efficiently solve the decomposed problem.
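The discretization of activation functions into linear subintervals that Forward Interval Propagation relies on can be illustrated with a small sketch; the input range, breakpoint count, and choice of sigmoid below are assumptions for illustration, not the paper's setup:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Break the activation's input range into K linear pieces (a linear spline).
K = 16
breakpoints = np.linspace(-6.0, 6.0, K + 1)
knot_values = sigmoid(breakpoints)

def piecewise_linear_sigmoid(x):
    # np.interp evaluates exactly this piecewise-linear interpolant.
    return np.interp(x, breakpoints, knot_values)

xs = np.linspace(-6.0, 6.0, 10001)
max_err = np.abs(piecewise_linear_sigmoid(xs) - sigmoid(xs)).max()
print(f"max error with {K} linear pieces: {max_err:.5f}")
```

Each linear piece is expressible with binary/linear variables, which is what lets a quadratic binary solver handle an otherwise nonlinear activation, at the cost of a controllable interpolation error.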


Searching for Low-Bit Weights in Quantized Neural Networks, Zhaohui Yang

Neural Information Processing Systems

However, the quantization functions used in most conventional quantization methods are non-differentiable, which increases the optimization difficulty of quantized networks. Compared with full-precision parameters (i.e., 32-bit floating-point numbers), low-bit values are selected from a much smaller set; for example, there are only 16 possibilities in a 4-bit space. Thus, we propose to treat the discrete weights in an arbitrary quantized neural network as searchable variables, and to search for them accurately with a differentiable method. In particular, each weight is represented as a probability distribution over the discrete value set. The probabilities are optimized during training, and the values with the highest probability are selected to establish the desired quantized network.


Review for NeurIPS paper: Searching for Low-Bit Weights in Quantized Neural Networks

Neural Information Processing Systems

Weaknesses: 1) A similar idea of learning an auxiliary differentiable network has also been introduced in the following paper. The main difference of this paper from that reference is that multiple bits are learned for each code here, whereas binary weights and representations would undoubtedly be more cost-efficient. More importantly, the authors did not discuss this similar reference. IJCAI, 2019. 2) I am very confused by Eq. (1). According to Eq. (1), the values v are discrete numbers, while p_i is the probability that an element of W takes the i-th discrete value.
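Based on the reviewer's description, Eq. (1) plausibly has the following soft-weight form; this is a reconstruction from context, not the paper's exact notation:

```latex
% Each weight W is relaxed to an expectation over the discrete candidate set
% {v_1, ..., v_n}; the p_i are the learned probabilities the reviewer mentions.
\hat{W} = \sum_{i=1}^{n} p_i \, v_i,
\qquad
p_i = \frac{\exp(a_i)}{\sum_{j=1}^{n} \exp(a_j)},
\qquad
\sum_{i=1}^{n} p_i = 1 .
```

Under this reading, the v_i are the fixed low-bit values and only the logits a_i (and hence the p_i) are trained, which resolves the apparent tension between discrete values and continuous probabilities.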


Review for NeurIPS paper: Searching for Low-Bit Weights in Quantized Neural Networks

Neural Information Processing Systems

The paper proposes a novel end-to-end gradient-based optimization for searching discrete low-bit weights in quantized networks. After reading the reviews, the rebuttal, and the discussion among reviewers, the paper is clearly recognized as novel and well executed. I would encourage the authors to further improve their work by better clarifying the decay strategy for the temperature in the camera-ready version, and by adding a comparison with SGDR scheduling, as pointed out by one of the reviewers. It would also be nice to mention how the proposed approach relates to "Latent Weights Do Not Exist: Rethinking Binarized Neural Network Optimization".



On Expressive Power of Quantized Neural Networks under Fixed-Point Arithmetic

Hwang, Geonho, Park, Yeachan, Park, Sejun

arXiv.org Machine Learning

Research into the expressive power of neural networks typically considers real parameters and operations without rounding error. In this work, we study the universal approximation property of quantized networks under discrete fixed-point parameters and fixed-point operations that may incur errors due to rounding. We first provide a necessary condition and a sufficient condition on fixed-point arithmetic and activation functions for universal approximation by quantized networks. Then, we show that various popular activation functions satisfy our sufficient condition, e.g., Sigmoid, ReLU, ELU, SoftPlus, SiLU, Mish, and GELU; in other words, networks using those activation functions are capable of universal approximation. We further show that our necessary condition and sufficient condition coincide under a mild condition on activation functions: e.g., that for an activation function $\sigma$, there exists a fixed-point number $x$ such that $\sigma(x)=0$. Namely, we find a necessary and sufficient condition for a large class of activation functions. We lastly show that even quantized networks using binary weights in $\{-1,1\}$ can universally approximate for practical activation functions.
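The fixed-point setting studied here can be made concrete with a small sketch; the signed 16-bit format with 8 fractional bits is an illustrative assumption:

```python
import numpy as np

def to_fixed_point(x, frac_bits=8, total_bits=16):
    # Round to the nearest multiple of 2**-frac_bits and saturate to the
    # representable range of a signed fixed-point number with total_bits bits.
    scale = 2.0 ** frac_bits
    lo = -(2 ** (total_bits - 1)) / scale
    hi = (2 ** (total_bits - 1) - 1) / scale
    return np.clip(np.round(x * scale) / scale, lo, hi)

x = np.array([0.1234567, -1.5, 3.14159])
xq = to_fixed_point(x)
print(xq)                    # nearest points on the 2**-8 grid
print(np.abs(x - xq).max())  # in-range rounding error is at most 2**-9
```

Every weight, activation, and intermediate result in such a network lives on this discrete grid, which is exactly the rounding-error regime the paper's approximation results have to cope with.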


DeepNcode: Encoding-Based Protection against Bit-Flip Attacks on Neural Networks

Velčický, Patrik, Breier, Jakub, Kovačević, Mladen, Hou, Xiaolu

arXiv.org Artificial Intelligence

Fault injection attacks are a potent threat against embedded implementations of neural network models. Several attack vectors have been proposed, such as misclassification, model extraction, and trojan/backdoor planting. Most of these attacks work by flipping bits in the memory where quantized model parameters are stored. In this paper, we introduce an encoding-based protection method against bit-flip attacks on neural networks, titled DeepNcode. We experimentally evaluate our proposal with several publicly available models and datasets, using state-of-the-art bit-flip attacks: BFA, T-BFA, and TA-LBF. Our results show an increase in protection margin of up to 7.6× for 4-bit and 12.4× for 8-bit quantized networks. Memory overheads start at 50% of the original network size, while the time overheads are negligible. Moreover, DeepNcode does not require retraining and does not change the original accuracy of the model.
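The damage a single flipped bit can do to a quantized parameter is easy to see in a sketch; the int8 weight encoding below is an illustrative assumption:

```python
import numpy as np

def flip_bit(w, bit):
    # Flip one bit of a signed 8-bit weight, as a memory fault would,
    # by XOR-ing its raw byte and reinterpreting the result.
    return (np.uint8(w) ^ np.uint8(1 << bit)).astype(np.int8)

w = np.int8(23)          # 0b00010111
print(flip_bit(w, 0))    # flipping the LSB gives 22, a tiny perturbation
print(flip_bit(w, 7))    # flipping the sign bit gives -105, a catastrophic one
```

The asymmetry between low-order and high-order bits is why targeted attacks like BFA go after sign and high-magnitude bits, and why encoding-based defenses aim to make exactly those flips detectable.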


Frame Quantization of Neural Networks

Czaja, Wojciech, Na, Sanghoon

arXiv.org Machine Learning

Quantization is the process of compressing input from a continuous or large set of values into a small discrete set. It gained popularity in signal processing, where one of its primary goals is obtaining a condensed representation of the analogue signal suitable for digital storage and recovery. Examples of quantization algorithms include truncated binary expansion, pulse-code modulation (PCM), and sigma-delta (ΣΔ) quantization. Among them, ΣΔ algorithms stand out due to their theoretically guaranteed robustness. Mathematical theories were developed in several seminal works [3-5, 8, 11] and have been carefully studied since, e.g., [14, 15, 19, 27]. In recent years, the concept of quantization has also captured the attention of the machine learning community. The quantization of deep neural networks (DNNs) is considered one of the most effective network compression techniques [9]. Computers express the parameters of a neural network as 32-bit or 64-bit floating-point numbers.